Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks

Authors: Liu Wei, Luo Jiebo, Luo Wenhan, Ma Lin, Xiong Wei
Publication date: 30/03/2018

Given a photo taken outside, can we predict the immediate future, e.g., how the clouds would move in the sky? We address this problem by presenting a two-stage approach, based on generative adversarial networks (GANs), to generating realistic high-resolution time-lapse videos. Given the first frame, our model learns to generate long-term future frames. The first stage generates videos with realistic content in each frame. The second stage refines the video generated by the first stage by forcing it to be closer to real videos with regard to motion dynamics. To further encourage vivid motion in the final generated video, a Gram matrix is employed to model the motion more precisely. We build a large-scale time-lapse dataset and test our approach on this new dataset. Using our model, we are able to generate realistic videos of up to 128 × 128 resolution for 32 frames. Quantitative and qualitative experimental results demonstrate the superiority of our model over state-of-the-art models.

Comment: To appear in Proceedings of CVPR 201
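The abstract does not spell out how the Gram matrix is computed or which features it is applied to. As a rough illustration only, the sketch below computes a generic Gram matrix (channel-by-channel inner products of feature maps) and a squared distance between two such matrices. The data layout, the function names `gram_matrix` and `gram_distance`, and the normalization by the number of positions are all assumptions made for this example, not details from the paper.

```python
def gram_matrix(features):
    """Return the C x C Gram matrix of C flattened feature channels.

    `features` is a list of C channels, each a list of N floats.
    This layout is hypothetical; a real model would use tensors.
    """
    num_channels = len(features)
    num_positions = len(features[0])
    gram = [[0.0] * num_channels for _ in range(num_channels)]
    for i in range(num_channels):
        for j in range(i, num_channels):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            # Dividing by N is an assumed normalization, not taken from the paper.
            gram[i][j] = gram[j][i] = dot / num_positions
    return gram


def gram_distance(features_a, features_b):
    """Sum of squared differences between two Gram matrices."""
    ga, gb = gram_matrix(features_a), gram_matrix(features_b)
    return sum((x - y) ** 2 for ra, rb in zip(ga, gb) for x, y in zip(ra, rb))
```

A distance of this kind could, in principle, serve as a loss term that compares the channel correlations of generated and real frames.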

arXiv.org e-Print Archive

Crossref

Generic multiple object tracking

Author: Luo Wenhan
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/07/2016

Multiple object tracking is an important problem in the computer vision community due to its applications, including but not limited to visual surveillance, crowd behavior analysis and robotics. The difficulty of this problem lies in several challenges, such as frequent occlusion, interaction and high-degree articulation. In recent years, data-association-based approaches have been successful in tracking multiple pedestrians on top of specific kinds of object detectors. These approaches are therefore type-specific, which may constrain their application in scenarios where type-specific object detectors are unavailable. In view of this, I investigate in this thesis tracking multiple objects without ready-to-use, type-specific object detectors. More specifically, the problem of multiple object tracking is generalized to tracking targets of a generic type; that is, the objects to be tracked are no longer constrained to be of a specific kind. This problem is termed Generic Multiple Object Tracking (GMOT), and it is handled by three approaches presented in this thesis. In the first approach, a generic object detector is learned from manual annotation of only one initial bounding box. The detector is then employed to regularize the online learning procedure of multiple trackers, each specialized to one object. More specifically, multiple trackers are learned simultaneously with shared features and are guided to stay close to the detector. Experimental results show considerable improvement on this problem compared with state-of-the-art methods. The second approach treats detection and tracking of multiple generic objects as a bi-label propagation procedure, which consists of class label propagation (detection) and object label propagation (tracking). In particular, clustered Multiple Task Learning (cMTL) is employed along with spatio-temporal consistency to address the online detection problem.
The tracking problem is addressed by associating existing trajectories with new detection responses, considering appearance, motion and context information. The advantages of this approach are verified by extensive experiments on several public data sets. The two aforementioned approaches handle GMOT in an online manner. In contrast, a batch method is proposed in the third work. It dynamically clusters given detection hypotheses into groups corresponding to individual objects. Inspired by the success of topic models in tackling textual tasks, a Dirichlet Process Mixture Model (DPMM) is utilized to address the tracking problem in cooperation with so-called must-links and cannot-links, which are proposed to avoid physical collision. Moreover, two kinds of representations, superpixels and the Deformable Part Model (DPM), are introduced to track both rigid and non-rigid objects. The effectiveness of the proposed method is demonstrated with experiments on public data sets.

Open Access
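The abstract describes associating existing trajectories with new detection responses using appearance, motion and context cues, but gives no formulas. The sketch below shows one simple way such an association step could look: a greedy matcher over a weighted cost. It is not the thesis's method. The dictionary keys `"pos"` and `"hist"`, the weight `appearance_weight`, the threshold `max_cost`, and the omission of context information are all assumptions made for illustration.

```python
import math


def association_cost(track, detection, appearance_weight=0.5):
    """Combine motion and appearance cues into one cost (lower is better).

    `track` and `detection` are dicts with hypothetical keys:
    "pos" (x, y) and "hist" (a normalized appearance histogram).
    """
    motion = math.dist(track["pos"], detection["pos"])
    appearance = sum(abs(a - b) for a, b in zip(track["hist"], detection["hist"]))
    return (1.0 - appearance_weight) * motion + appearance_weight * appearance


def associate(tracks, detections, max_cost=10.0):
    """Greedily match tracks to detections, cheapest pairs first."""
    pairs = sorted(
        (association_cost(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for cost, ti, di in pairs:
        if cost > max_cost:
            break  # all remaining pairs cost at least as much
        if ti in used_tracks or di in used_dets:
            continue
        matches[ti] = di
        used_tracks.add(ti)
        used_dets.add(di)
    return matches
```

Greedy matching is chosen here only for brevity; an optimal assignment method could be substituted without changing the cost function.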

Spiral - Imperial College Digital Repository